Craighead County
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Utah (0.04)
- (4 more...)
Generative Data Imputation for Sparse Learner Performance Data Using Generative Adversarial Imputation Networks
Zhang, Liang, Lin, Jionghao, Sabatini, John, Zapata-Rivera, Diego, Forsyth, Carol, Jiang, Yang, Hollander, John, Hu, Xiangen, Graesser, Arthur C.
Advancements in AI-driven technologies have significantly enhanced modern education through personalized tutoring and adaptive learning strategies on online platforms [1], [2]. Intelligent Tutoring Systems (ITSs) exemplify this progress by leveraging advanced machine learning and natural language processing models to create interactive learning environments that improve outcomes across domains like literacy [3], mathematics [4], language learning [5], biology [6] and other STEM fields [7]. As human learners interact with ITSs, often through question-and-answer scenarios with immediate responses, their performance data becomes crucial for learner modeling, enabling systems to track progress, predict future performance, and adapt instruction accordingly [8]. Learner models like Bayesian Knowledge Tracing (BKT) and other knowledge tracing variants use the learner performance data to uncover learning characteristics and to estimate knowledge states and acquisition [9]. However, in real-world scenarios, missing learner performance data is prevalent due to factors such as learner dropout or disengagement [10], technical issues or incomplete data logging [11], biased sampling within experimental groups [12], and more. These challenges often lead to sparse data, where items (i.e., questions or problems) remain unattempted (e.g., learners may bypass the question, leave it unanswered due to a lack of response initiation, or make no attempt to engage with it), alongside limited learner interactions [13], [14]. As shown in Figure 1, missing performance records can occur along both the attempt and question dimensions during learner-ITS interactions. In the right portion of the figure's two matrices, entries marked with "?
- Asia > China > Hong Kong (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Oceania > Australia (0.04)
- (12 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Education > Educational Technology > Educational Software > Computer Based Training (1.00)
- Education > Educational Setting > Online (1.00)
- Education > Assessment & Standards (1.00)
- Education > Curriculum > Subject-Specific Education (0.86)
Exploring the Potential of Large Language Models for Estimating the Reading Comprehension Question Difficulty
Jain, Yoshee, Hollander, John, He, Amber, Tang, Sunny, Zhang, Liang, Sabatini, John
Reading comprehension is key to individual success, yet the assessment of question difficulty remains challenging due to the extensive human annotation and large-scale testing required by traditional methods such as linguistic analysis and Item Response Theory (IRT). While these robust approaches provide valuable insights, their scalability is limited. There is potential for Large Language Models (LLMs) to automate question difficulty estimation; however, this area remains underexplored. Our study investigates the effectiveness of LLMs, specifically OpenAI's GPT-4o and o1, in estimating the difficulty of reading comprehension questions using the Study Aid and Reading Assessment (SARA) dataset. We evaluated both the accuracy of the models in answering comprehension questions and their ability to classify difficulty levels as defined by IRT. The results indicate that, while the models yield difficulty estimates that align meaningfully with derived IRT parameters, there are notable differences in their sensitivity to extreme item characteristics. These findings suggest that LLMs can serve as a scalable method for automated difficulty assessment, particularly in dynamic interactions between learners and Adaptive Instructional Systems (AIS), bridging the gap between traditional psychometric techniques and modern AIS for reading comprehension and paving the way for more adaptive and personalized educational assessments. The manuscript has been accepted for presentation at the 27th International Conference on Human-Computer Interaction in Gothenburg, Sweden, from June 22-27, 2025.
- Europe > Sweden > Vaestra Goetaland > Gothenburg (0.24)
- North America > United States > Illinois > Champaign County > Urbana (0.14)
- North America > United States > Tennessee > Shelby County > Memphis (0.05)
- (2 more...)
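To make the IRT-defined difficulty levels in the abstract above concrete, here is a minimal sketch of an item response function and a difficulty banding step. The abstract does not say which IRT model or which cut-offs the study used, so the two-parameter logistic (2PL) form, the function names, and the thresholds at -1 and +1 are all assumptions chosen for illustration.

```python
import math


def p_correct(theta, a, b):
    """2PL item response function (assumed model, not confirmed by the source).

    theta: learner ability, a: item discrimination, b: item difficulty.
    Returns the probability that the learner answers the item correctly.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def difficulty_band(b, easy_below=-1.0, hard_above=1.0):
    """Map an IRT difficulty parameter to a coarse label.

    The cut-offs are hypothetical; a real study would derive them from
    the distribution of calibrated item parameters.
    """
    if b < easy_below:
        return "easy"
    if b > hard_above:
        return "hard"
    return "medium"


if __name__ == "__main__":
    # A learner of average ability facing items of increasing difficulty.
    for b in (-2.0, 0.0, 2.0):
        print(b, difficulty_band(b), round(p_correct(0.0, 1.0, b), 3))
```

Under this sketch, an LLM's predicted label for a question could be compared against `difficulty_band(b)` for that question's calibrated `b`.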
Data Augmentation for Sparse Multidimensional Learning Performance Data Using Generative AI
Zhang, Liang, Lin, Jionghao, Sabatini, John, Borchers, Conrad, Weitekamp, Daniel, Cao, Meng, Hollander, John, Hu, Xiangen, Graesser, Arthur C.
Learning performance data describe correct and incorrect answers or problem-solving attempts in adaptive learning, such as in intelligent tutoring systems (ITSs). Learning performance data tend to be highly sparse (80% to 90% missing observations) in most real-world applications due to adaptive item selection. This data sparsity presents challenges to using learner models to effectively predict future performance and explore new hypotheses about learning. This article proposes a systematic framework for augmenting learner data to address data sparsity in learning performance data. First, learning performance is represented as a three-dimensional tensor of learners' questions, answers, and attempts, capturing longitudinal knowledge states during learning. Second, a tensor factorization method is used to impute missing values in sparse tensors of collected learner data, thereby grounding the imputation on knowledge tracing tasks that predict missing performance values based on real observations. Third, a module for generating patterns of learning is used. This study contrasts two forms of generative Artificial Intelligence (AI), Generative Adversarial Networks (GANs) and Generative Pre-trained Transformers (GPT), to generate data associated with different clusters of learner data. We tested this approach on an adult literacy dataset from AutoTutor lessons developed for Adult Reading Comprehension (ARC). We found that: (1) tensor factorization improved the performance in tracing and predicting knowledge mastery compared with other knowledge tracing techniques without data augmentation, showing higher relative fidelity for this imputation method, and (2) the GAN-based simulation showed greater overall stability and less statistical bias based on a divergence evaluation with varying simulation sample sizes compared to GPT.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- Asia > China > Hong Kong (0.04)
- Oceania > Australia (0.04)
- (9 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Instructional Material (1.00)
- Education > Educational Technology > Educational Software > Computer Based Training (1.00)
- Education > Educational Setting (1.00)
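The tensor-factorization step described in the abstract above can be illustrated with a small sketch. The abstract does not specify the factorization, so this example assumes a CP (rank-R) decomposition with a logistic link, trained by stochastic gradient descent on the observed cells only, and assumes the tensor axes are learner, question, and attempt. The function name `cp_impute` and all hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cp_impute(observed, shape, rank=2, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Fit a CP factorization to the observed cells of a sparse 3-D tensor.

    observed: dict mapping (learner, question, attempt) -> 0 or 1.
    shape:    (n_learners, n_questions, n_attempts).
    Returns a function predict(i, j, k) giving the estimated probability
    of a correct response for any cell, observed or missing.
    """
    rng = random.Random(seed)
    # One factor matrix per axis; positive init keeps the product away from zero.
    U, V, W = (
        [[rng.uniform(0.5, 1.0) for _ in range(rank)] for _ in range(n)]
        for n in shape
    )

    def logit(i, j, k):
        return sum(U[i][r] * V[j][r] * W[k][r] for r in range(rank))

    cells = list(observed.items())
    for _ in range(epochs):
        rng.shuffle(cells)
        for (i, j, k), y in cells:
            # Gradient of the cross-entropy loss with respect to the logit.
            err = sigmoid(logit(i, j, k)) - y
            for r in range(rank):
                u, v, w = U[i][r], V[j][r], W[k][r]
                U[i][r] -= lr * (err * v * w + reg * u)
                V[j][r] -= lr * (err * u * w + reg * v)
                W[k][r] -= lr * (err * u * v + reg * w)

    return lambda i, j, k: sigmoid(logit(i, j, k))


if __name__ == "__main__":
    # Toy data: 2 learners x 2 questions x 2 attempts, half the cells missing.
    obs = {(0, 0, 0): 1, (0, 1, 0): 1, (1, 0, 0): 0, (1, 1, 1): 0}
    predict = cp_impute(obs, shape=(2, 2, 2))
    print(predict(0, 0, 1))  # imputed probability for an unobserved cell
```

Training only on observed cells is what lets the fitted factors fill in the missing ones; the imputed tensor could then feed a downstream generative module such as the GAN or GPT step the abstract mentions.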
WavePulse: Real-time Content Analytics of Radio Livestreams
Mittal, Govind, Gupta, Sarthak, Wagle, Shruti, Chopra, Chirag, DeMattee, Anthony J, Memon, Nasir, Ahamad, Mustaque, Hegde, Chinmay
Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real-time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at https://wave-pulse.io.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > New York > Kings County > New York City (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (215 more...)
- Media > Radio (1.00)
- Leisure & Entertainment (1.00)
- Government > Voting & Elections (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Do We Need iPhone Moment or Xiaomi Moment for Robots? Design of Affordable Home Robots for Health Monitoring
Wei, Bo, Bian, Yaya, Gao, Mingcen
In this paper, we study cost-effective home robot solutions designed for home health monitoring. Recent advancements in Artificial Intelligence (AI) have significantly advanced the capabilities of robots, enabling them to understand and interact with their surroundings better and more efficiently. The most common robots currently used in homes are toy robots and cleaning robots. While these are relatively affordable, their functionalities are very limited. On the other hand, humanoid and quadruped robots offer more sophisticated features and capabilities, albeit at a much higher cost. Another category is educational robots, which provide educators with the flexibility to attach various sensors and integrate different design methods with the integrated operating systems. However, the challenge of bridging the gap between affordability and functionality remains. Our research aims to address this by exploring the potential of developing advanced yet affordable and accessible home robots for health monitoring, using edge computing techniques and taking advantage of computing resources already available in the home, such as mobile phones.
- Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.05)
- North America > United States > California > San Francisco County > San Francisco (0.05)
- North America > United States > Arkansas > Craighead County > Jonesboro (0.05)
Benchmarking Edge AI Platforms for High-Performance ML Inference
Jayanth, Rakshith, Gupta, Neelesh, Prasanna, Viktor
Edge computing's growing prominence, due to its ability to reduce communication latency and enable real-time processing, is promoting the rise of high-performance, heterogeneous System-on-Chip solutions. While current approaches often involve scaling down modern hardware, the performance characteristics of neural network workloads on these platforms can vary significantly, especially when it comes to parallel processing, which is a critical consideration for edge deployments. To address this, we conduct a comprehensive study comparing the latency and throughput of various linear algebra and neural network inference tasks across CPU-only, CPU/GPU, and CPU/NPU integrated solutions. We find that the Neural Processing Unit (NPU) excels in matrix-vector multiplication (58.6% faster) and some neural network tasks (3.2× faster for video classification and large language models). The GPU outperforms in matrix multiplication (22.6% faster) and LSTM networks (2.7× faster), while the CPU excels at less parallel operations like dot product. NPU-based inference offers a balance of latency and throughput at lower power consumption. GPU-based inference, though more energy-intensive, performs best with large dimensions and batch sizes. We highlight the potential of heterogeneous computing solutions for edge AI, where diverse compute units can be strategically leveraged for accurate, real-time inference.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Arkansas > Craighead County > Jonesboro (0.04)
- Information Technology (1.00)
- Semiconductors & Electronics (0.67)
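The benchmarking abstract above compares latency and throughput for linear algebra kernels such as matrix-vector multiplication. As a rough sketch of how those two metrics relate, the harness below times a pure-Python matrix-vector product on the CPU. It is not the authors' methodology: the `benchmark` helper, the warm-up count, and the use of the median are assumptions, and real edge benchmarks would call vendor runtimes for the GPU or NPU instead.

```python
import random
import time


def matvec(matrix, vector):
    """Dense matrix-vector product, one dot product per row."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def benchmark(fn, args, batch_size=1, repeats=20, warmup=3):
    """Time fn(*args) and report per-call latency and throughput.

    Each timed run executes batch_size calls back to back; the median
    run time is used to damp outliers from scheduling noise.
    """
    for _ in range(warmup):
        fn(*args)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(batch_size):
            fn(*args)
        timings.append(time.perf_counter() - start)
    timings.sort()
    median = timings[len(timings) // 2]
    return {
        "latency_s": median / batch_size,        # seconds per call
        "throughput_per_s": batch_size / median,  # calls per second
    }


if __name__ == "__main__":
    n = 128
    matrix = [[random.random() for _ in range(n)] for _ in range(n)]
    vector = [random.random() for _ in range(n)]
    print(benchmark(matvec, (matrix, vector), batch_size=8))
```

On a sequential CPU loop like this one, throughput is simply the inverse of latency; the two diverge on parallel hardware, where larger batches can raise throughput without a proportional rise in per-batch time.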
Anchor-Free Correlated Topic Modeling: Identifiability and Algorithm
In topic modeling, many algorithms that guarantee identifiability of the topics have been developed under the premise that there exist anchor words, i.e., words that only appear (with positive probability) in one topic. Follow-up work has resorted to third- or higher-order statistics of the data corpus to relax the anchor word assumption. Reliable estimates of higher-order statistics are hard to obtain, however, and the identification of topics under those models hinges on uncorrelatedness of the topics, which can be unrealistic. This paper revisits topic modeling based on second-order moments, and proposes an anchor-free topic mining framework. The proposed approach guarantees the identification of the topics under a much milder condition compared to the anchor-word assumption, thereby exhibiting much better robustness in practice. The associated algorithm only involves one eigendecomposition and a few small linear programs. This makes it easy to implement and scale up to very large problem instances. Experiments using the TDT2 and Reuters-21578 corpora demonstrate that the proposed anchor-free approach exhibits very favorable performance (measured using coherence, similarity count, and clustering accuracy metrics) compared to the prior art.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Utah (0.04)
- (4 more...)
De-identification of clinical free text using natural language processing: A systematic review of current approaches
Kovačević, Aleksandar, Bašaragin, Bojana, Milošević, Nikola, Nenadić, Goran
Background: Electronic health records (EHRs) are a valuable resource for data-driven medical research. However, the presence of protected health information (PHI) makes EHRs unsuitable to be shared for research purposes. De-identification, i.e., the process of removing PHI, is a critical step in making EHR data accessible. Natural language processing has repeatedly demonstrated its feasibility in automating the de-identification process. Objectives: Our study aims to provide systematic evidence on how the de-identification of clinical free text has evolved in the last thirteen years, and to report on the performances and limitations of the current state-of-the-art systems. In addition, we aim to identify challenges and potential research opportunities in this field. Methods: A systematic search in PubMed, Web of Science and DBLP was conducted for studies published between January 2010 and February 2023. Titles and abstracts were examined to identify the relevant studies. Selected studies were then analysed in-depth, and information was collected on de-identification methodologies, data sources, and measured performance. Results: A total of 2125 publications were identified for the title and abstract screening. 69 studies were found to be relevant. Machine learning (37 studies) and hybrid (26 studies) approaches are predominant, while six studies relied only on rules. The majority of the approaches were trained and evaluated on public corpora. The 2014 i2b2/UTHealth corpus is the most frequently used (36 studies), followed by the 2006 i2b2 (18 studies) and 2016 CEGS N-GRID (10 studies) corpora.
- North America > United States > South Carolina > Charleston County > Charleston (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Missouri > Boone County > Columbia (0.14)
- (44 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
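The review summarized above notes that a handful of de-identification systems rely only on rules. As a toy illustration of that rule-based style, the sketch below masks a few PHI-like patterns with regular expressions. The pattern set, the labels, and the `deidentify` function are hypothetical and far narrower than any system covered by the review; real clinical text needs much broader coverage (names, addresses, free-form dates) and careful evaluation.

```python
import re

# Hypothetical rules: each pair is (replacement label, compiled pattern).
PATTERNS = [
    ("DATE", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),   # e.g. 03/14/2021
    ("PHONE", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),        # e.g. 555-123-4567
    ("MRN", re.compile(r"\bMRN[:\s]*\d+\b")),               # e.g. MRN: 778812
]


def deidentify(text):
    """Replace every match of each rule with a bracketed category label."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    note = "Seen on 03/14/2021, call 555-123-4567 (MRN: 778812)."
    print(deidentify(note))
    # prints: Seen on [DATE], call [PHONE] ([MRN]).
```

Hybrid systems of the kind the review counts would typically combine rules like these with a learned sequence tagger, using the rules for highly regular identifiers.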
The Two Faces of AI in Green Mobile Computing: A Literature Review
Siemers, Wander, Sallou, June, Cruz, Luís
Artificial intelligence is bringing ever new functionalities to the realm of mobile devices that are now considered essential (e.g., camera and voice assistants, recommender systems). Yet, operating artificial intelligence takes up a substantial amount of energy. At the same time, artificial intelligence is also being used to enable more energy-efficient solutions for mobile systems. Hence, artificial intelligence has two faces in that regard: it is both a key enabler of desired (efficient) mobile functionalities and a major power draw on these devices, playing a part in both the solution and the problem. In this paper, we present a review of the literature of the past decade on the usage of artificial intelligence within the realm of green mobile computing. From the analysis of 34 papers, we highlight the emerging patterns and map the field into 13 main topics that are summarized in detail. Our results show that the field has been growing slowly in recent years, specifically since 2019. Regarding the double impact AI has on mobile energy consumption, the energy consumption of AI-based mobile systems is under-studied in comparison to the usage of AI for energy-efficient mobile computing, and we argue for more exploratory studies in that direction. We observe that although most studies are framed as solution papers (94%), the large majority do not make those solutions publicly available to the community. Moreover, we also show that most contributions are purely academic (28 out of 34 papers) and that we need to promote the involvement of the mobile software industry in this field.
- North America > United States > Arkansas > Craighead County > Jonesboro (0.04)
- Europe > United Kingdom > England > Staffordshire > Keele (0.04)
- Europe > Sweden > Skåne County > Malmö (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- Overview (1.00)
- Research Report > New Finding (0.88)
- Energy (1.00)
- Information Technology > Software (0.34)